hunterhector
left a comment
The example still does not use Forte, and there is no clear documentation of how to use UDA.
A user who looks at this example would not know how to incorporate UDA. There is no explanation in the README and there are no comments in main.py. All they see is 500 lines of code, with no indication of where to start or which parts are important.
Comments on the previous PR have not been addressed, such as calling subprocess from Python, and Forte is not used at all. Why would I need this framework instead of going to Google's code?
Codecov Report
@@            Coverage Diff             @@
##           master     #343      +/-   ##
==========================================
- Coverage   81.87%   81.78%   -0.10%
==========================================
  Files         187      187
  Lines       11941    11940       -1
==========================================
- Hits         9777     9765      -12
- Misses       2164     2175      +11
hunterhector
left a comment
I think we'd better place all data aug examples in the same folder: https://github.com/asyml/forte/tree/master/examples/data_augmentation
This PR is still of very low quality and cannot be merged in its current state. For example, where is the step to generate the back translation data? How can users do it on their own?
Here is a list of problems:
- There are no variable type annotations.
- There are no instructions on how to get the back translation data.
- A lot of code is copied directly from Google.
- None of the docstrings follow our standard.
- After I mentioned using the training team's format, nothing changed; there is still a lot of ad-hoc code such as InputFeatures and InputExample.
If that's the quality of this work, there's no reason for a user to choose this over Google's implementation.
hunterhector
left a comment
I only see the back_trans folder with the requirement file; I guess you need to refer to Google's code?
Another option is to document, step by step, how to get the back translation data from Google, without copying the code over.
Again, most docstrings do not follow our standard.
Sure. It's still a work in progress. I will add instructions on how to install
This PR fixes #293.
Description of changes
This PR adds an example that uses UDA to train a text classifier on the IMDB text classification dataset. Please see the README for details.
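For reviewers unfamiliar with UDA, here is a minimal sketch of the kind of objective it optimizes: a supervised cross-entropy term on labeled examples plus a consistency term (KL divergence) between the model's predictions on an unlabeled example and on its augmented version, for instance a back-translated copy. This is not the code in this PR. The function names and the `lam` weight are hypothetical, it uses only the Python standard library, and it omits refinements such as training signal annealing and confidence masking.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert logits to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Negative log-likelihood of the gold label."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def uda_loss(
    sup_logits: Sequence[Sequence[float]],
    sup_labels: Sequence[int],
    orig_logits: Sequence[Sequence[float]],
    aug_logits: Sequence[Sequence[float]],
    lam: float = 1.0,  # hypothetical weight on the consistency term
) -> float:
    """Supervised loss plus lam times the unsupervised consistency loss."""
    # Supervised part: mean cross-entropy over the labeled batch.
    sup = sum(
        cross_entropy(logits, label)
        for logits, label in zip(sup_logits, sup_labels)
    ) / len(sup_labels)
    # Unsupervised part: predictions on the original unlabeled text act as
    # the target; predictions on the augmented text are pulled toward them.
    unsup = sum(
        kl_divergence(softmax(orig), softmax(aug))
        for orig, aug in zip(orig_logits, aug_logits)
    ) / len(orig_logits)
    return sup + lam * unsup
```

With `lam=0.0` the function reduces to the plain supervised loss, which corresponds to the "without UDA" baseline; in a real training loop the predictions on the original unlabeled text would also be excluded from gradient computation.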
Test Conducted
Performed experiments with/without UDA, under supervised and semi-supervised settings.